Objective

Data

Read Dataset

View Dataset

Duplication check

Dropping unneeded columns

Since we have already confirmed that every CLIENTNUM value is unique to a customer, the column is only an identifier and carries no statistical information, so we will drop it.
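A minimal sketch of this step, assuming the data is held in a pandas DataFrame. The frame name `df`, the `Customer_Age` column, and the sample values are hypothetical stand-ins for illustration; only `CLIENTNUM` comes from the text above.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real dataset.
# Only the CLIENTNUM column name is taken from the document.
df = pd.DataFrame({
    "CLIENTNUM": [1001, 1002, 1003],
    "Customer_Age": [45, 52, 38],  # assumed feature column
})

# Re-confirm that the identifier is unique per row before discarding it.
assert df["CLIENTNUM"].is_unique

# Drop the identifier; it would add noise rather than signal to a model.
df = df.drop(columns=["CLIENTNUM"])
print(df.columns.tolist())
```

Dropping the column only after the uniqueness check keeps the duplication check meaningful: once the identifier is gone, duplicate customers can no longer be detected by ID.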

Understanding shape

Checking data type

Fixing data type

Missing Value check

Categorical data value check

Fixing wrong categories

Dataset Summary

EDA

Univariate Analysis

Bivariate Analysis

Summary of EDA

Data Description:

Data Cleaning:

Observations from EDA

Data preparation for modeling

Outlier value treatment

Splitting Data

Missing Value treatment

Outlier Treatment

Model Building

Logistic Regression

Decision Tree

Bagging

AdaBoost

Gradient Boost

XGBoost

Oversampling using SMOTE

Logistic Regression Oversampling

Decision Tree Oversampling

Bagging Oversampling

AdaBoost Oversampling

GradientBoost Oversampling

XGB Oversampling

Undersampling using Random Undersampler

Logistic Regression Undersampling

Decision Tree Undersampling

Bagging Undersampling

AdaBoost Undersampling

GradientBoost Undersampling

XGB Undersampling

Model Comparison

Model Tuning

AdaBoost Classifier

GridSearchCV

RandomizedSearchCV

GradientBoost Classifier

GridSearchCV

RandomizedSearchCV

Building model with best parameters

XGB Classifier

GridSearchCV

RandomizedSearchCV

Comparing tuned models

Performance on test set

Pipeline Creation

Business Recommendations